
    Faster K-Means Cluster Estimation

    There has been considerable work on improving the popular clustering algorithm k-means in terms of both mean squared error (MSE) and speed. However, most k-means variants compute the distance of each data point to each cluster centroid in every iteration. We propose a fast heuristic that overcomes this bottleneck with only a marginal increase in MSE. We observe that across the iterations of k-means, a data point changes its membership only among a small subset of clusters. Our heuristic predicts these clusters for each data point by looking at nearby clusters after the first iteration of k-means. We augment well-known variants of k-means with our heuristic to demonstrate its effectiveness. On various synthetic and real-world datasets, the heuristic achieves a speed-up of up to 3x compared to efficient variants of k-means.
    Comment: 6 pages, Accepted at ECIR 201
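    A minimal sketch of the candidate-cluster heuristic as the abstract describes it; the parameter t (how many nearby clusters each point keeps) and all names are assumptions, not the authors' code.

```python
import numpy as np

def kmeans_with_candidates(X, k, t=5, iters=50, seed=0):
    """Plain Lloyd k-means, except that after the first full iteration
    each point only re-checks its t nearest clusters."""
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), k, replace=False)]

    # First iteration: full n x k distance computation, as in plain k-means.
    d = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    labels = d.argmin(axis=1)
    # Heuristic: freeze, per point, the t nearest clusters as candidates.
    candidates = np.argsort(d, axis=1)[:, :t]

    for _ in range(iters - 1):
        for j in range(k):
            members = X[labels == j]
            if len(members):
                centroids[j] = members.mean(axis=0)
        # Distances only to candidate centroids: n x t instead of n x k.
        dc = np.linalg.norm(X[:, None, :] - centroids[candidates], axis=2)
        labels = candidates[np.arange(len(X)), dc.argmin(axis=1)]
    return labels, centroids
```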

    Neural Network Methods for Boundary Value Problems Defined in Arbitrarily Shaped Domains

    Partial differential equations (PDEs) with Dirichlet boundary conditions defined on boundaries with simple geometry have been successfully treated using sigmoidal multilayer perceptrons in previous works. This article deals with the case of complex boundary geometry, where the boundary is determined by a number of closely spaced points that belong to it, so as to offer a reasonable representation. Two networks are employed: a multilayer perceptron and a radial basis function network. The latter is used to account for the satisfaction of the boundary conditions. The method has been successfully tested on two-dimensional and three-dimensional PDEs and has yielded accurate solutions.
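    A rough sketch of the two-network construction: one Gaussian RBF is centred on each sampled boundary point, and its weights are obtained by a linear solve so that the sum of the two networks matches the Dirichlet data at those points. The function names and the width eps are assumptions, not the article's exact formulation.

```python
import numpy as np

def add_rbf_boundary_term(mlp, boundary_pts, boundary_vals, eps=1.0):
    """Return psi(x) = mlp(x) + RBF(x), where the RBF weights are solved
    so that psi reproduces the Dirichlet values at the boundary points."""
    B = np.asarray(boundary_pts)                        # (m, dim)
    # Gram matrix of Gaussians centred on the boundary points.
    G = np.exp(-eps * np.sum((B[:, None] - B[None, :]) ** 2, axis=2))
    # The RBF part must absorb the MLP's residual on the boundary.
    w = np.linalg.solve(G, np.asarray(boundary_vals) - mlp(B))

    def psi(x):                                         # x: (n, dim)
        K = np.exp(-eps * np.sum((x[:, None] - B[None, :]) ** 2, axis=2))
        return mlp(x) + K @ w

    return psi
```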

    Artificial Neural Networks for Solving Ordinary and Partial Differential Equations

    We present a method to solve initial and boundary value problems using artificial neural networks. A trial solution of the differential equation is written as a sum of two parts. The first part satisfies the boundary (or initial) conditions and contains no adjustable parameters. The second part is constructed so as not to affect the boundary conditions; it involves a feedforward neural network containing adjustable parameters (the weights). Hence, by construction, the boundary conditions are satisfied and the network is trained to satisfy the differential equation. The applicability of this approach ranges from single ODEs, to systems of coupled ODEs, and to PDEs. In this article we illustrate the method by solving a variety of model problems and present comparisons with finite elements for several cases of partial differential equations.
    Comment: LaTeX file, 26 pages, 21 figs, submitted to IEEE TN
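    The construction is concrete enough for a worked toy example. Below is a minimal sketch for the ODE y' = -y with y(0) = 1 on [0, 1]: the trial solution psi(x) = 1 + x*N(x) satisfies the initial condition by construction, so only the ODE residual is minimised. The tiny one-hidden-layer net, the collocation grid, and the optimiser are illustrative assumptions, not the paper's exact setup.

```python
import numpy as np
from scipy.optimize import minimize

x = np.linspace(0.0, 1.0, 20)            # collocation points
h = 1e-5                                  # step for the numerical derivative

def net(p, x):                            # one hidden tanh layer, 10 units
    w, b, v = p[:10], p[10:20], p[20:30]
    return np.tanh(np.outer(x, w) + b) @ v

def trial(p, x):
    return 1.0 + x * net(p, x)            # psi(0) = 1 holds by construction

def residual(p):                          # enforce y' + y = 0 on the grid
    dpsi = (trial(p, x + h) - trial(p, x - h)) / (2 * h)
    return np.sum((dpsi + trial(p, x)) ** 2)

p0 = 0.1 * np.random.default_rng(0).standard_normal(30)
p = minimize(residual, p0, method="BFGS").x
print(np.max(np.abs(trial(p, x) - np.exp(-x))))   # error vs exact e^{-x}
```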

    Artificial Neural Network Methods in Quantum Mechanics

    In a previous article we have shown how one can employ Artificial Neural Networks (ANNs) to solve non-homogeneous ordinary and partial differential equations. In the present work we consider the solution of eigenvalue problems for differential and integrodifferential operators using ANNs. We start by considering the Schrödinger equation for the Morse potential, which has an analytically known solution, to test the accuracy of the method. We then proceed with the Schrödinger and Dirac equations for a muonic atom, as well as with a non-local Schrödinger integrodifferential equation that models the n+α system in the framework of the resonating group method. In two dimensions we consider the well-studied Hénon-Heiles Hamiltonian, and in three dimensions the model problem of three coupled anharmonic oscillators. In all of the treated cases the method proved to be highly accurate, robust, and efficient; hence it is a promising tool for tackling problems of higher complexity and dimensionality.
    Comment: LaTeX file, 29 pages, 11 psfigs, submitted in CP
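    For eigenvalue problems, the natural extension is to make the eigenvalue an extra trainable parameter. A minimal sketch for the 1D problem -psi'' + x^2 psi = E psi (the harmonic oscillator, chosen here for brevity; the paper treats the Morse potential and the other systems listed above), with an exp(-x^2/2) envelope so the trial decays at infinity:

```python
import numpy as np
from scipy.optimize import minimize

x = np.linspace(-5.0, 5.0, 200)
h = x[1] - x[0]

def net(p, x):                             # one hidden tanh layer, 8 units
    w, b, v = p[:8], p[8:16], p[16:24]
    return np.tanh(np.outer(x, w) + b) @ v

def psi(p, x):
    return np.exp(-x**2 / 2) * net(p, x)   # decays at infinity by construction

def residual(q):                           # q = (network weights, E)
    p, E = q[:-1], q[-1]
    f = psi(p, x)
    d2 = (psi(p, x + h) - 2 * f + psi(p, x - h)) / h**2
    r = -d2 + x**2 * f - E * f             # residual of H psi = E psi
    return np.sum(r**2) / np.sum(f**2)     # normalise to exclude f = 0

q0 = np.append(0.1 * np.random.default_rng(1).standard_normal(24), 0.5)
q = minimize(residual, q0, method="BFGS").x
print(q[-1])   # should approach an eigenvalue (ground state E = 1 here)
```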

    Privacy Preserving Multi-Server k-means Computation over Horizontally Partitioned Data

    The k-means algorithm is one of the most popular clustering algorithms in data mining. Recently, much research has concentrated on the setting where the dataset is divided among multiple parties, or where it is too large to be handled by the data owner. In the latter case, servers are usually hired to perform the clustering: the data owner divides the dataset among the servers, who together perform k-means and return the cluster labels to the owner. The major challenge is to prevent the servers from gaining substantial information about the owner's actual data. Several algorithms designed in the past provide cryptographic solutions for privacy-preserving k-means. We provide a new method to perform k-means over a large dataset using multiple servers. Our technique avoids heavy cryptographic computations; instead, we use a simple randomization technique to preserve the privacy of the data. The computed k-means has exactly the same efficiency and accuracy as k-means computed over the original dataset without any randomization. We argue that our algorithm is secure against an honest-but-curious, passive adversary.
    Comment: 19 pages, 4 tables. International Conference on Information Systems Security, Springer, Cham, 201
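    The abstract does not spell out the randomization technique, but the claim that the clustering is exactly as accurate as on the raw data is a known property of distance-preserving transforms. As an illustration only (not the paper's protocol), a random orthogonal rotation preserves all pairwise Euclidean distances, so a server clustering the rotated data reproduces the owner's labels:

```python
import numpy as np
from scipy.stats import ortho_group
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
X = rng.standard_normal((500, 8))             # the owner's raw data

Q = ortho_group.rvs(dim=8, random_state=0)    # random rotation (the mask)
X_masked = X @ Q                              # what a server would see

labels_raw = KMeans(n_clusters=4, n_init=10,
                    random_state=0).fit_predict(X)
labels_masked = KMeans(n_clusters=4, n_init=10,
                       random_state=0).fit_predict(X_masked)
# Identical clustering (up to floating-point effects on exact ties).
print((labels_raw == labels_masked).all())
```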

    Application of Relevance Feedback in Content Based Image Retrieval Using Gaussian Mixture Models


    Supersymmetric hybrid inflation in the braneworld scenario

    In this paper we reconsider supersymmetric hybrid inflation in the context of the braneworld scenario. The observational bounds are satisfied with an inflationary energy scale $\mu \simeq 4\times 10^{-4} M_p$, without any fine-tuning of the coupling parameter, provided that the five-dimensional Planck scale satisfies $M_5 \stackrel{<}{\sim} 2\times 10^{-3} M_p$. We have also obtained an upper bound on the brane tension.
    Comment: 8 pages (LaTeX)
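    For context, braneworld analyses of this kind rest on the modified Friedmann equation of the Randall-Sundrum II scenario; a hedged sketch of the standard relations (assumed background, not quoted from the paper):

```latex
% Standard RS-II braneworld relations (assumed background, not quoted
% from the paper): the Friedmann equation gains a rho^2 correction
% controlled by the brane tension lambda, which ties M_5 to M_p.
H^2 = \frac{8\pi}{3 M_p^2}\,\rho\left(1 + \frac{\rho}{2\lambda}\right),
\qquad
\lambda = \frac{3}{4\pi}\,\frac{M_5^6}{M_p^2}.
```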

    A Comparative Study of Efficient Initialization Methods for the K-Means Clustering Algorithm

    K-means is undoubtedly the most widely used partitional clustering algorithm. Unfortunately, due to its gradient descent nature, the algorithm is highly sensitive to the initial placement of the cluster centers. Numerous initialization methods have been proposed to address this problem. In this paper, we first present an overview of these methods with an emphasis on their computational efficiency. We then compare eight commonly used linear-time initialization methods on a large and diverse collection of data sets using various performance criteria. Finally, we analyze the experimental results using non-parametric statistical tests and provide recommendations for practitioners. We demonstrate that popular initialization methods often perform poorly and that there are in fact strong alternatives to these methods.
    Comment: 17 pages, 1 figure, 7 tables
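    A minimal sketch of this kind of comparison, using the two linear-time initializations that ship with scikit-learn ("random" and "k-means++"); the paper benchmarks eight methods on many data sets with several criteria, so this only illustrates the experimental setup:

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris

X = load_iris().data
for init in ("random", "k-means++"):
    # One Lloyd run per seed so the initialization's own quality shows.
    sse = [KMeans(n_clusters=3, init=init, n_init=1,
                  random_state=s).fit(X).inertia_ for s in range(20)]
    print(f"{init:10s} mean SSE {np.mean(sse):.2f} +/- {np.std(sse):.2f}")
```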

    Linear, Deterministic, and Order-Invariant Initialization Methods for the K-Means Clustering Algorithm

    Over the past five decades, k-means has become the clustering algorithm of choice in many application domains, primarily due to its simplicity, time/space efficiency, and invariance to the ordering of the data points. Unfortunately, the algorithm's sensitivity to the initial selection of the cluster centers remains its most serious drawback. Numerous initialization methods have been proposed to address this drawback. Many of these methods, however, have time complexity superlinear in the number of data points, which makes them impractical for large data sets. On the other hand, linear methods are often random and/or sensitive to the ordering of the data points. These methods are generally unreliable in that the quality of their results is unpredictable. Therefore, it is common practice to perform multiple runs of such methods and take the output of the run that produces the best results. Such a practice, however, greatly increases the computational requirements of the otherwise highly efficient k-means algorithm. In this chapter, we investigate the empirical performance of six linear, deterministic (non-random), and order-invariant k-means initialization methods on a large and diverse collection of data sets from the UCI Machine Learning Repository. The results demonstrate that two relatively unknown hierarchical initialization methods due to Su and Dy outperform the remaining four methods with respect to two objective effectiveness criteria. In addition, a recent method due to Erisoglu et al. performs surprisingly poorly.
    Comment: 21 pages, 2 figures, 5 tables, Partitional Clustering Algorithms (Springer, 2014). arXiv admin note: substantial text overlap with arXiv:1304.7465, arXiv:1209.196
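    To make the key property concrete, here is an illustrative deterministic, order-invariant initialization (not one of the six benchmarked methods): seed with the data mean, then repeatedly add the point farthest from the chosen centers. Both steps depend only on the point set, not on its ordering (up to exact distance ties):

```python
import numpy as np

def maximin_init(X, k):
    """Deterministic, order-invariant k-means seeding (illustrative)."""
    centers = [X.mean(axis=0)]                     # order-invariant seed
    for _ in range(k - 1):
        # Each point's distance to its nearest already-chosen center.
        d = np.min(np.linalg.norm(X[:, None] - np.asarray(centers)[None],
                                  axis=2), axis=1)
        centers.append(X[d.argmax()])              # farthest point joins next
    return np.asarray(centers)
```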